-
Notifications
You must be signed in to change notification settings - Fork 228
feat(compass-collection): Schema Analysis Redux Integration for Collection Plugin CLOUDP-333846 #7177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
||
interface SchemaAnalysisFinishedAction { | ||
type: CollectionActions.SchemaAnalysisFinished; | ||
schemaAnalysis: SchemaAnalysis; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actions shouldn't be controlling the state, reducer should, passing the whole state and just assigning it in the reducer is an antipattern that should be avoided, don't do this. Clearly separate what is the action payload and what is the state that reducer will derive based on the action
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this. Wondering if it's worth it to move this to its own reducer. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty self contained, so can be its own slice, yeah
@jcobis @gribnoysup Heads up - Adding In order to remove the cyclical dependency, I think we'd have to remove the Since we only need schema_depth from this function, I'm considering extracting/duplicating just part of this calculation in |
}, | ||
A | ||
>; | ||
|
||
export enum SchemaAnalysisStatus { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] fyi there's another thread that advocates the compass repos's preference for unions over enums in #7181 (comment). I personally prefer them over enums because they enable type algebra like creating intersection/union types, and enums add runtime artifacts that union types do not
ERROR = 'error', | ||
} | ||
|
||
type SchemaAnalysis = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| null
in various fields is a sign that we can apply discriminated fields to get type narrowing benefits.
You can break this into a type for each Status's state
type SchemaAnalysis = SchemaAnalysisError | SchemaAnalysisInitial | SchemaAnalysisAnalyzing | SchemaAnalysisCompleted`
// example. notice there's no `error` field or `| null`
type SchemaAnalysisCompleted = {
schema: Schema;
status: SchemaAnalysisStatus;
sampleDocument: Document;
schemaMetaData: { ... };
}
// etc
each type shares the status
but the status
helps the type checker see what fields exist based on the status
The TypeScript handbook has an example here
This should simplify the action action+reducer code as well by removing entries for nullable fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very helpful! I'm gonna move these types to their own file as well!
return; | ||
} | ||
|
||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume it's possible to go to ANALYZING directly from COMPLETED or FAILED as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's possible. For now, this function will be invoked when collection metadata is fetched successfully.
sampleDocument: sampleDocuments[0] ?? null, | ||
schemaMetadata, | ||
}); | ||
} catch (err: any) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there documented errors that can be caught here? Would help devs looking at stack traces narrow the source of the issue earlier, especially because there's multiple await
calls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! I don't think this is extensive, but I copied the error handling logic from schema-analysis-reducer
to this file and added a SchemaAnalysis specific error type. See getErrorDetails. This would help with error handling in the UI as well.
} | ||
|
||
let schemaMetadata = null; | ||
if (schema !== null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should happen if schema
stays null? Seems like an edge case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! I was assuming a null
schema
would get handled in the UI code once the analysis completed. If we don't have a schema
, our schema analysis is incomplete.
Should we consider this a "failure" state? Something else? @jcobis Do you have thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, the return type of schemaAccessor.getInternalSchema()
is non-null Schema
. This may be an unnecessary check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think it's an unnecessary check. We probably don't need to check if (schemaAccessor)
either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the analysis fails I think the exception would be thrown by getInternalSchema()
, which would be caught by the surrounding try/catch block
@ncarbon I don't think you can easily remove the dependency here, compass-collection is the place that keeps the source of truth for this type, you can't really move it somewhere else without creating an awkward intermediate place for it and I think I'd prefer we avoid doing that 🙂 I think if you want to just implement the depth counting somewhere near the code that does the check that's fine (in your case you don't even need to count the full depth, right? only that it doesn't hit a relatively small limit). You should also consider moving this code to mongodb-schema package that is already this one place where we accumulate all the shared code related to schema analysis. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work Nataly!
const sampleCursor = dataService.sampleCursor( | ||
namespace, | ||
samplingOptions, | ||
driverOptions, | ||
{ | ||
fallbackReadPreference: 'secondaryPreferred', | ||
} | ||
); | ||
const sampleDocuments: Array<Document> = await sampleCursor.toArray(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just use .sample
if you're immediately iterating the whole cursor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Updated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces schema analysis functionality to the Compass Collection Tab, automatically analyzing a collection's schema when it loads (excluding read-only and time-series collections). The implementation includes new Redux state management for schema analysis with comprehensive status tracking and error handling.
- Adds new Redux state management for schema analysis with status tracking (initial, analyzing, completed, error)
- Implements automatic schema analysis on collection load with proper filtering for read-only and time-series collections
- Introduces schema depth calculation utility for analyzing nested document structures
Reviewed Changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 8 comments.
Show a summary per file
File | Description |
---|---|
collection-tab.ts | Adds schema analysis state initialization and triggers analysis on metadata load |
collection-tab.spec.ts | Updates tests to mock schema analysis and verify conditional execution |
schema-analysis-types.ts | Defines TypeScript types for schema analysis state management |
collection-tab.ts (modules) | Implements core schema analysis thunk action with document sampling and error handling |
calculate-schema-depth.ts | Utility function for calculating maximum nesting depth in schema structures |
calculate-schema-depth.spec.ts | Comprehensive test suite for schema depth calculation |
package.json | Adds required dependencies for schema analysis functionality |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diffs for my last feedback and additions around calculate-schema-depth
are good for me
Description
This pull request introduces schema analysis functionality to the Compass Collection Tab, enabling the application to automatically analyze a collection's schema when it is loaded (unless the collection is read-only or time-series). The changes include new Redux state and actions for schema analysis, and a thunk to perform the analysis.
Motivation
Schema Analysis Feature:
SchemaAnalysis
state toCollectionState
, with status tracking (INITIAL
,ANALYZING
,COMPLETED
,ERROR
), schema data, sample document, schema metadata, and error information.analyzeCollectionSchema
that samples documents, analyzes the schema, calculates metadata (like max nesting depth and validation rules), and updates the Redux state accordingly.Store Initialization and Integration:
schemaAnalysis
state and inject required dependencies (logger, preferences, abort controller) for schema analysis.Testing Enhancements:
Checklist
Motivation and Context
To extract schema information (types and validation rules) that will be used in the upcoming mock data generator feature.
Open Questions
Dependents
Types of changes